Targeting AAV vectors to the central nervous system by engineering capsid–receptor interactions that enable crossing of the blood–brain barrier

Viruses have evolved the ability to bind and enter cells through interactions with a wide variety of cell macromolecules. We engineered peptide-modified adeno-associated virus (AAV) capsids that transduce the brain through the introduction of de novo interactions with two proteins expressed on the mouse blood–brain barrier (BBB), LY6A or LY6C1. The in vivo tropisms of these capsids are predictable as they are dependent on the cell- and strain-specific expression of their target protein. This approach generated hundreds of capsids with dramatically enhanced central nervous system (CNS) tropisms within a single round of screening in vitro and secondary validation in vivo, thereby reducing the use of animals in comparison to conventional multi-round in vivo selections. The reproducible and quantitative data derived via this method enabled both saturation mutagenesis and machine learning (ML)-guided exploration of the capsid sequence space. Notably, during our validation process, we determined that nearly all published AAV capsids that were selected for their ability to cross the BBB in mice leverage either the LY6A or LY6C1 protein, which are not present in primates. This work demonstrates that AAV capsids can be directly targeted to specific proteins to generate potent gene delivery vectors with known mechanisms of action and predictable tropisms.

multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).
We have renamed all supplementary files to S# Data. Because our datasets are large (more than 1M rows, which exceeds Excel's maximum), it is not possible to put all of these data into Excel file(s).
-Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.
Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it:
The library data files required to reproduce all plotted figure panels are too large (298 MB) to include as supplementary data and are therefore deposited on Zenodo at https://doi.org/10.5281/zenodo.7689794. The Supplementary Data files S1-S23 contain analyzed and processed values derived from the data available on Zenodo.
Our Data Availability statement now reads: All code used in this study is available on GitHub: https://github.com/vectorengineering/AAV_capsid_receptor/. All data required to reproduce each plot are available from the Zenodo open repository at https://doi.org/10.5281/zenodo.7689794. These data include RPMs and enrichment values from the libraries used in this study. Supplementary Data files S1-S23 contain analyzed and processed values derived from the data available on Zenodo.
In addition, we provide a comprehensive table pointing reviewers and readers to the supplementary data file(s) required to produce each plot.
(D) Please also ensure that each of the relevant figure legends in your manuscript include information on *WHERE THE UNDERLYING DATA CAN BE FOUND*, and ensure your supplemental data file/s has a legend.
As requested, we have added information on where the underlying data can be found at the end of each relevant figure legend.

(E) Please ensure that your Data Statement in the submission system accurately describes where your data can be found and is in final format, as it will be published as written there.
We have updated our Data Statement as described above.
(F) Please note that per journal policy, we do not allow the mention of "data not shown" or other references to data that is not publicly available or contained within this manuscript (line 399). Please either remove mention of these data or provide figures presenting the results and the data underlying the figure(s).
We have now removed the sentence in our Methods section originally at line 399: "In most cases, the beads were saturated with Fc-fusion protein as confirmed by the protein gel (data not shown)." This change does not affect the reader's ability to replicate our work.
(G) Please also provide a blurb which (if accepted) will be included in our weekly and monthly Electronic
We provide the following blurb for our publication: Directly targeting adeno-associated virus capsids to specific proteins expressed on the blood-brain barrier can rapidly generate potent gene delivery vectors with predictable in vivo tropisms.
We thank Reviewer #1 for their assessment of our work as novel, rational, and of interest to the gene therapy community. We agree that in vivo validation remains a major challenge in AAV engineering. The readouts of AAV library screens typically consist of either DNA- or mRNA-based enrichment scores. These values from in vivo studies provide a metric by which to rank the function of sequences relative to each other. While these data generally predict the relative in vivo function of capsids accurately, the ranking derived from the library data does not always perfectly match the ranking observed when capsids are tested individually.

There are multiple hypothetical reasons that could account for these minor discrepancies.
First, there is bias introduced by production fitness, e.g., variants that are rare in the starting library may have inflated enrichment scores. This bias is not present when testing viruses individually. Second, the amount of capsid mRNA produced from individual transduced cells will vary, especially in an organ like the brain where there is a great diversity of cell types.
Despite these potential biases, we show here that the library data can be effectively used to (1) pick individual top hits from the selection and (2) train generative ML models.
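The sensitivity of enrichment scores to low starting abundance can be illustrated with a short worked example. This is a minimal sketch, not the scoring code used in the study: it assumes that enrichment is the log2 ratio of a variant's reads per million (RPM) after selection to its RPM in the starting library, and all read counts and library sizes below are hypothetical.

```python
import math


def rpm(count, total_reads):
    """Reads per million for one variant in one sequenced sample."""
    return count / total_reads * 1_000_000


def log2_enrichment(out_count, out_total, in_count, in_total):
    """Assumed score: log2 of output RPM over starting-library RPM."""
    return math.log2(rpm(out_count, out_total) / rpm(in_count, in_total))


# Hypothetical sequencing depths for the starting library and the selected pool.
IN_TOTAL = 10_000_000
OUT_TOTAL = 10_000_000

# Two hypothetical variants with the same true 4-fold enrichment.
abundant = log2_enrichment(4000, OUT_TOTAL, 1000, IN_TOTAL)
rare = log2_enrichment(4, OUT_TOTAL, 1, IN_TOTAL)

# The same two stray reads in the output barely move the abundant variant
# but noticeably inflate the score of the rare one.
abundant_noisy = log2_enrichment(4002, OUT_TOTAL, 1000, IN_TOTAL)
rare_noisy = log2_enrichment(6, OUT_TOTAL, 1, IN_TOTAL)

print(f"abundant: {abundant:.3f} -> {abundant_noisy:.3f}")
print(f"rare:     {rare:.3f} -> {rare_noisy:.3f}")
```

Under these assumptions, a handful of extra reads shifts the score of the rare variant by more than half a log2 unit, while the abundant variant is essentially unchanged. This is one way that variants that are rare in the starting library can end up with inflated enrichment scores.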
We also agree that the outcomes of ML depend on the nature of the training data and that ML should be applied to answer questions that can be feasibly addressed given the type of training data available. In our approach, we use ML to predict new capsids that interact with a target receptor, i.e., our objective is to use the ML as a method for affinity maturation/optimization by expanding the sequence space explored in a second-round library. The value of ML in this application is in optimizing the search for more functional sequences so that it is cost-effective and practical (as compared to performing saturation mutagenesis on every motif identified in first round screening). The training data produced by our receptor-targeted approach is well suited for this purpose without having to precisely score the in vivo performance of variants relative to each other. However, we have a separate manuscript available, Eid et al. 2022 bioRxiv, that addresses the question of how training data and models may be generated that can more accurately rank variants against each other.
All of our screens were performed such that the engineered 7-mers were inserted into all capsid subunits (VP1, VP2, and VP3). Evaluating these novel 7-mers as mosaics with wild-type or other engineered capsids is an interesting application that is beyond the scope of this study.

Minor -Should define ML when it first appears on line 71
Thank you for catching this. We have now defined ML where it is first mentioned in our introduction.
We thank Reviewer #2 for their appreciation of our exciting findings and their insight that identifying target proteins that can mediate AAV transduction is the challenging and invigorating next step of extending our approach. Using the same approach described in this manuscript, we have developed capsids that engage human proteins present on the CNS endothelium, but testing these capsids in vivo is slowed by the need to generate mice that express the human form of the target receptor gene. We are in the process of characterizing these capsids and believe there are likely dozens of promising human receptors that could be used to target gene delivery vectors to disease-relevant cell types. In support of this, researchers in the Gradinaru laboratory at Caltech (Shay et al. bioRxiv, 2023) have also determined that two previously described capsids, 9P31 and 9P36 (Nonnenmacher et al. MTMCD, 2020), can use the mouse carbonic anhydrase IV (Car4, CA4 in humans) protein to cross the BBB. These are the same two previously published capsids that we did not identify as LY6A- or LY6C1-binding capsids (Fig 2F). The identification of mouse Car4 as a receptor for an engineered AAV is exciting as it extends the types of proteins that can be considered as targets beyond the Ly6 family. Interestingly, Car4/CA4 shares several features with Ly6a and Ly6c1, notably that they are all GPI-anchored membrane proteins that are highly expressed on brain endothelium. Encouragingly, unlike Ly6a and Ly6c1, carbonic anhydrase IV (CA4) is present in rodents and primates. Although the primary sequences of Car4/CA4 diverge significantly across species, and 9P31 and 9P36 do not appear to bind primate forms of the CA4 protein, it may be possible to engineer AAVs that cross the BBB by engaging the human CA4 protein.

Reviewer #2 (Casey
We have now added new text about these recent developments in our discussion: "Recently, Shay et al. have identified Carbonic Anhydrase IV [43] as the cellular protein coopted by the two previously reported mouse BBB-crossing AAVs (9P31 and 9P36) that we reported here do not engage LY6A or LY6C1 [3]. Like LY6A and LY6C1, Carbonic Anhydrase IV is a GPI-anchored protein that is highly expressed on CNS endothelial cells. Notably, this protein is present in both rodents and primates, and may therefore represent a prime target for engineering receptor-targeted AAVs with a predictable mechanism of action for human CNS gene therapy."
Comment 2. What were the cellular targets in the brain of the identified capsids (BI48, BI49, BI28, BI62, BI65)? Were any "detargeted" from liver?
Capsids that cross the BBB using LY6A or LY6C1 to enter the mouse CNS seem to share the ability to transduce neurons, astrocytes, oligodendrocytes, and endothelial cells, although some capsids show partial bias toward specific cell types like endothelial cells and astrocytes (e.g., AAV-PHP.V1). As fewer potent LY6C1-binding capsids have been reported and characterized, we chose our overall best-performing LY6C1-binding capsid, BI28, for further analysis. We now show new data in S9A-C Fig that AAV-BI28 transduces neurons and glia, as assessed by colocalization with NeuN, S100, and APC (CC1) positive cells.
AAV-BI28, and LY6C1-binding capsids more generally, are highly effective at transducing astrocytes. Therefore, we also assessed the ability of AAV-BI28 to mediate gene editing in astrocytes throughout the adult brain. For this experiment, we used AAV-BI28 to deliver a dual vector SaCas9-
We had speculated in our 2019 publication describing the LY6A-dependent mechanism of AAV-PHP.eB that some favorable characteristics of receptors that could mediate efficient AAV crossing of the BBB might include (1) abundant luminal surface exposure on brain endothelium, (2) localization within lipid micro-domains through GPI anchoring, or (3) specific recycling/intracellular trafficking capabilities. Further support for the use of highly expressed GPI-anchored proteins for mediating efficient BBB crossing of engineered AAVs comes from a recent preprint (Shay et al. bioRxiv 2023) that identified mouse Carbonic Anhydrase IV as a receptor for the previously described 9P31 and 9P36 capsids (Nonnenmacher et al. 2021). However, natural AAV-receptor interactions are not restricted to GPI-anchored proteins and involve a diverse set of molecules, including AAVR, integrins, and a variety of glycans. We expect to glean additional information about what types of receptors to target through additional screening efforts.

Comment 1. It was interesting to see the AAV-F peptide sequence (FVVGQSY) mediates Ly6c binding and many of these amino acids (5/7) were observed in some of the pull-down derived capsids (e.g. FVYGQIA, Fig S6) as well as the MDVIA capsid from the Sabeti group (5/7 amino acid identity). This further suggests that this motif is strongly associated with Ly6C binding and should be included in the text (results or discussion).
The in vitro pull-down assays identified a large number of sequence motifs, including ones encompassing AAV-F, that were capable of mediating interactions with LY6A or LY6C1 that led to dramatically enhanced in vivo CNS transduction. This particular motif was not observed in as many sequences as other motifs, which is why we chose not to highlight it in
The approaches we describe here should make it possible for researchers with the appropriate expertise to identify unique capsids that bind the same target. This is no different from what occurs in the development of other biologics (e.g., antibodies or nanobodies), where numerous researchers and companies have developed antibodies that recognize a common target. As with these other molecules, capsids that bind a single target can differ substantially in their functional characteristics beyond receptor binding; thus, careful and thorough screening is an essential component of finding the capsids that perform best across critical functional metrics.
While it is attractive to think that novel AAV capsids with high target cell transduction can be generated with minimal cycling in animal models and at a rapid rate, one potential concern is that binding of specific protein targets may not translate into other steps critical for viral infection such as uptake/cell entry. This aspect should be highlighted as a potential limitation.
Another potential limitation is the species-specific difference often seen in such protein-based targets (from a sequence perspective) and the implications for species-selectivity that need to be discussed. Nevertheless, while the study does not provide any significant new insight into AAV biology (from using previously identified receptors in mice), the incorporation of the SVAE machine learning technique exemplifies the power of this technique in being able to predict infectivity and fitness from a vector production standpoint.
Based on comments from both Reviewers #2 and #3, we have now added wording to our discussion to highlight the challenge of identifying cellular proteins that can act as de novo AAV receptors.
The reviewer is correct that species-specific differences will remain a challenge with this in vitro approach except in cases where the target protein is highly conserved. We point out the encouraging findings recently reported by other groups who have identified proteins that exist in NHPs and humans with the ability to facilitate BBB-crossing of engineered AAVs. It is also important to mention that in vivo AAV selections have, with only a few exceptions, also yielded species-specific capsids. And given that the receptor target is not usually obvious, in vivo selections yield capsids with unknown translational potential.
This manuscript is likely the first of many, from our group and others, that will use in vitro receptor binding assays to rapidly identify gene delivery vectors with known mechanisms of action and predictable species or cross-species tropisms.
Overall, from a methods/resource contribution perspective, the authors provide a detailed validation of this approach, adequate controls (e.g., comparing and contrasting sequences enriched for Ly6A and Ly6C1 binding compared to Fc binding alone) and the ability of the machine learning approach combined with in-life selection as being on par with in life selection alone.

Additional comments:
In figure 4F, the recovered enriched sequences are plotted according to cluster log size and cluster max enrichment. Since the data are available and easily sortable, e.g., in Figure S13, the authors should consider plotting enrichment of brain transduction for individual sequences within the SVAE and Sat. mut. Libraries, either in a new Figure
We have updated our S13 Fig (this is now S15 due to the addition of two new supplementary figures) to plot the in vitro binding and brain transduction enrichment scores of individual Round 2, saturation mutagenesis, and SVAE variants. The panels show the performance of individual variants for each assay: LY6A-Fc pull-down, LY6C1-Fc pull-down, C57BL/6J mouse brain transduction by LY6A-binding variants, or C57BL/6J mouse brain transduction by LY6C1-binding variants. The plots show that both approaches generated top performers in in vitro target binding and in vivo brain transduction compared to the top hits from the Round 1 screen.
are: S12, S16-20 Data; for LY6C1, these are: S13-15, S21-23. We have now cited these data collectively (S12-23 Data) in the caption for Fig 4F and S15 Fig.